Regulation by small RNAs via coupled degradation: mean-field and variational 
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Regulatory genes called small RNAs (sRNAs) are known to play critical roles in cellular responses 
to changing environments. For several sRNAs, regulation is effected by coupled stoichiometric 
degradation with messenger RNAs (mRNAs). The nonlinearity inherent in this regulatory scheme 
indicates that exact analytical solutions for the corresponding stochastic models are intractable. 
Here, we present a variational approach to analyze a well-studied stochastic model for regulation by 
sRNAs via coupled degradation. The proposed approach is efficient and provides accurate estimates 
of mean mRNA levels as well as higher order terms. Results from the variational ansatz are in 
excellent agreement with data from stochastic simulations for a wide range of parameters, including 
regions of parameter space where mean-field approaches break down. The proposed approach can 
be applied to quantitatively model stochastic gene expression in complex regulatory networks. 

PACS numbers: 87.10.Mn, 02.50.r, 82.39.Rt, 87.17.Aa, 45.10.Db 
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A new paradigm for cellular regulation has emerged in 
recent years with the discovery of novel non-coding genes 
called small RNAs (sRNAs). In bacteria, sRNAs often 
function as global regulators that mediate cellular adap- 
tation to changing environments In higher organisms, 
the corresponding genes (microRNAs) are known to play 
key roles in the regulation of critical processes such as 
development, stem cell pluripotency and cancer 0, It 
has been proposed that one of the key functions of sR- 
NAs in controlling cellular processes is to regulate the 
variability (noise) in gene expression Q. Recent experi- 
mental developments have led to approaches for quantify- 
ing such variability using single-molecule measurements 
of mRNA levels [J]. These technological advances have 
now made possible experimental studies that analyze the 
roles of sRNAs in noise regulation during important pro- 
cesses such as development. Correspondingly, there is a 
need for theoretical approaches that complement such ex- 
perimental efforts to enable a quantitative understanding 
of different mechanisms of sRNA-based regulation. 

While the molecular mechanisms of sRNA-mediated 
regulation continue to be investigated, one established 
mechanism, representative of several bacterial sRNAs, 
corresponds to binding with mRNAs followed by cou- 
pled stoichiometric degradation An important chal- 
lenge for current research is to analyze how this regula- 
tory mechanism impacts the variability of gene expres- 
sion across a population of cells. Several recent the- 
oretical studies |6l-[ll| have analyzed models based on 
the corresponding reaction scheme (shown in Fig. lA). 
The nonlinearity inherent in this reaction scheme im- 



plies that exact analytical solutions for the correspond- 
ing stochastic model are intractable; thus approximate 
analytical approaches are needed. Previous theoretical 
studies have primarily focused on mean-field (MF) ap- 
proaches and on steady-state distributions using expan- 
sions around MF solutions. However, MF approaches 
are not accurate when we have a combination of nonlin- 
ear reaction rates (due to interaction with small RNAs) 
and low mRNA/sRNA levels, which points to the need 
for development of alternative analytical approaches. 

In this paper, we analyze stochastic models of sRNA- 
based regulation via coupled degradation (as shown in 
Fig. lA). We first discuss the MF approximation, which 
corresponds to neglecting mRNA-sRNA correlations, and 
define dimensionless variables that are useful in quantify- 
ing deviations between MF results and data from stochas- 
tic simulations. To go beyond MF, we use a variational 
approach which has been success fully ap plied to gene reg- 
ulatory networks in recent work 12Hla| . Within this ap- 
proach, we present a general ansatz for the steady-state 
probability distribution which, at the simplest level, re- 
duces to the MF approximation. At the next level, the 
variational ansatz gives results that are in excellent agree- 
ment with data from simulations for the mean and vari- 
ance of the regulated mRNA distribution. The proposed 
method can be used for efhcient and accurate quantita- 
tive analysis of sRNA-based regulation of gen expression. 

We begin by considering the kinetic scheme presented 
in Fig. lA. The probability distribution of mRNA and 



2 



(A) 



— sRNA 





FIG. 1; A) The kinetic scheme for regulation of mRNA by 
smaU RNAs with coupled degradation rate 7. B) The ratio 
X — {m)/nm, obtained from simulation data, is plotted as 
a function of rim and Us- Parameters are chosen such that 
Em = Es = 1 and 7=1. For nm,ns 2> 1, X converges 
towards the MF preduction {X ~ 0.618). 



sRNA levels per cell, Pm,s{t), obeys the master equation: 

(to + l)P,„+i,s + l.ls{s + l)P,n^s+i 

+ 7(m + l)(s + l)P„+i^,+i 

- {km + kg + ^Irnm + ^is■S +"fms) Pm,s, 

where kj, iij {j ~ m,s) and 7 are the parameters de- 
fined in Fig. lA. We will focus on the stationary dis- 
tribution, denoted by j. . It is convenient to define 
the following set of independent dimensionless parame- 
ters: em = ksj/flmtJ-s, Cs = Knl / fJ-mfJ-s and Hj = kj / flj 

(j = TO, s). From the master equation ([1]), we can ex- 
plicitly relate the average mRNA and sRNA levels to the 
correlation term (tos) 3j 121 via: 



^ / _ (to) 
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where (.} denotes the stationary average. More gener- 
ally, moments at one level are coupled to higher-order 
moments due to the nonlinear interaction term. This 
hierarchy makes the exact solution of the master equa- 
tion intractable. Defining X = {m)/nm, Y ~ {s)/ns and 
C = {ms)/{m){s), equation (0) leads to 
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= C XY. 



(3) 



Traditionally, a first approximation, known as the 
MF approximation, consists of neglecting correlations 
through the substitution (tos) -> (to)(s). The MF as- 
sumption thus corresponds to C = 1 and leads to 



e,„A:y + a: - 1 = 0, e,xr + y-i = o. 



(4) 



Comparing Eqns. (3) and (4), we see that the exact 
means (i.e. solutions of Eqn. (3)) are generated by the 



MF solutions considered with the rescaled interaction pa- 
rameter 7' = C7. Determination of C can therefore pro- 
vide accurate estimates of the mean mRNA and sRNA 
levels. The ratio C is also an indicator of the accuracy of 
MF: it is a good approximation when C ~ 1, whereas de- 
viations from unity indicate that better approximations 
are needed. 

Furthermore, note that X and Y are, in general, func- 
tions of the four parameters e™, e^, rim and ris] however 
the MF approximation (Eq. 4) predicts that both quanti- 
ties depend only on and Cg. It follows that MF theory 
breaks down in regions of parameter space where X and 
Y depend on the parameters rim and rig (for fixed and 
Eg). These regions are indicated by significant deviations 
between the exact ratio X (Y) and the solution A+ (A_) 
ofEq. dlD. 

We now anlayze deviations of the MF results from 
stochastic simulations data obtained using the Gillespie 
algorithm Is"] . The ratios X and C are plotted in figures 
IB and 2A respectively. These data are presented as a 
function of Um and n^, keeping Cm and constant. The 
figures indicate that both quantities converge towards the 
MF predictions in the limit rig, rim ^ I {X 0.618 
and C ^ 1). More significantly, the data shows that 
MF is not a good approximation for small rim and rtg. 
This is important to note since, in several cellular sys- 
tems, mRNA abundances can be low (i.e. rim is small) 
[lit . This indicates that more accurate approximation 
are needed in such cases. 

Furthermore, in the uncorrelated approximation, the 
stationary probability distribution can be written as the 
product of Poisson distributions 11^+ (to) x (s) , where 
^x{n) = e~^x"/nl. Defining the marginal distribu- 
tions P„*, = EsPLs and P; = the ratio 

'^j = 0)/(0^)~0) ) U = to, s) measures deviations 
between the marginals (P*) and the simple Poisson dis- 
tribution. Again, deviations of D = x dm fi'om unity 
reveal that both marginal probability distributions can- 
not be approximated by the Poisson distribution. In Fig. 
2B, stochastic simulations data indicate that the coeffi- 
cient D deviates significantly from one for large rim and 
Ug. This observation implies that higher-order terms, 
such as (to^) and (s^) cannot be obtained using the MF 
prediction (j^) — (j)^ = (j) {j = to, s), even in regions of 
parameter space for which the mean values are given ac- 
curately by the MF approximation. Interestingly, it is for 
small parameter values rij (j = m, s), for which the MF 
approximation does not give accurate mean values, that 
D converges to one. This observation is an indication 
that the Poisson distribution is in some way embedded 
in the structure of Pm,s- 

Based on the preceding analysis, it seems natural to 
approximate Pm ^ as a superposition of Poisson distribu- 
tions. This approximation can be implemented using the 
variational method introduced by Eyink f20|, combined 
with the quantum Hamiltonian formalism of the master 
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FIG. 2: Stationary value oi C — {ms)/{m){s) (A) and 
D — dm X ds (B), obtained from simulation data, plotted 
as a function of rim and n,. We keep = es = 1 and 7 = 1. 



equation [l^, [l3|. Following the mapping outlined by 
Doi [2l[, we define the operators and a (respectively 
and b) associated with the creation and annihilation 
of mRNA (sRNA). The master equation ([T]) takes the 
compact form dt\ip{t)) = —£\4>{t)) with 

£ = km + kg + iij^a^a + fisb^b + ja^ab^b 

- {kmO^ + fcs&^ + Mm" + Ms^ + 7"^) - (5) 

where [a, at] = [6,6+] = 1 and a|0,s) = 0, 6|m,0) = 0. 
Focusing on the stationary state, we denote by {tpL\ and 
iV'fl) the left and right eigenstates with vanishing eigen- 
value. They obey (i/'iln, to) = (V'-lIV'-r) = 1- The 
mapping to the original problem is given by P,*j ^ — 
(to, s|'(/'i?)/'Ti!s!. 

To initiate the variational ansatz, we define the left 
and right trial vectors (((/)l(Al)| and |</)ii;(Ai?))), con- 
structed using a set of independent parameters, Kl and 
Kr. Defining the functional H(Al, Afl) = ((^L|£|0i?), the 
eigenstates are determined using the variational principle 
STi. = 0. A detailed explanation of the variational scheme 
is provided in . 

We now generalize the uncorrelated approximation to 
propose a specific ansatz for the trial vectors as the su- 
perposition of Poisson distributions. A similar ansatz has 
also been proposed in a recent study of reaction systems 
including different chemical species 15| . We define 



with Kr = {ofp,^,, 9p,g} and A^ = {dp.q} i&d.d = 0). 
In each vector, the total number of parameters Af is 
given by A/" = d{d + 2). The parameters of (0l| are 
imposed by the condition ((f)L\m,n) ^ 1 which leads to 
9p^q = 0, Vp, q. It follows that the set Ar is solution of 
{d(j)L\C\4'R)\Ai,={o} — 0- Our calculation leads to the sys- 
tem of equations: 



-fmses(l — nm/ctp) + 



{ij + + jap) (8) 
(1 ~ ns//3,)] = 0, 



.(Ai)| = (0,0|e 



a+b 



i,J=0 



(6) 



generated for i,j — 0, 1,2, ...,d with the pair {i = d,j — 
d) excluded. The first equation (for i = j = Q), corre- 
sponds to the probabihstic interpretation: (^lI^a) = 1 
and leads to the normalization constraint ^ 0p,<? = 1 • 
From equation ([8]) one can then generates the M in- 
dependent conditions required to determine the right 
eigenvector parameters. It follows that an approxima- 
tion of the stationary distribution is given by g = 
(TO,s|0fl(A|j))/TO!s!, where A^ = {a;,/3*,e;_J is so- 
lution of ([8]). The latter distribution can be explic- 
itly written as a superposition of Poisson distributions: 
'P*m.,s = Ep,g0;,gn„j(TO)n^.(s). We note that the 
MF results are recovered by considering the ansatz with 
d—1. In this case, ^ is simply a product of two Pois- 
son distributions with means a and (3 respectively. The 
variational equations give nsijim — a) — tmOiP = and 
nm{ns — /3) — e„ia/3 = 0, leading to X = X+ (y = A_) 
and C = D = 1. 

Going one step beyond the MF approximation, we con- 
sider the ansatz ([7]) with = 2. We first consider the 
symmetric case km — kg — k and = IJ-s = jJ" This 
choice imposes aj — Pj {j = 1,2) and <di^2 = 02,i- The 
set A^, solution of the equations generated by ([8]), is 
obtained numerically using standard routines. From a 
practical point of view, the numerical calculation is sig- 
nificantly faster than stochastic simulations, especially if 
we need to explore large regions of parameter space. 

Figure 3^ presents a comparison of our results with 
data from stochastic simulations. Keeping the ratios 
and Cs constant, the quantities X, C and D are plotted 
as a function of for 7 = 1, 5 and 7 = 10. Clearly, 
deviations from MF results appear more pronounced as 
7 increases. However, for a range of parameter values /i 
and even for large mRNA-sRNA coupling, the variational 
scheme gives accurate values of the mean mRNA level per 
cell ((m) = X X Tim)- Additionally, we checked that the 
predictions for (s) also presents an excellent agreement 
with simulation data. Importantly, the agreement of our 
predictions with simulation data, for the quantities C and 
D, shows that the variational method also gives accurate 
values of higher order terms such as the correlation (tos) 
(= C X (to)(s)) and variance (j^) - (j)^ (= {j)/dj). 
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FIG. 3: Comparisons of simulation data (symbols), ansatz 
predictions (lines) and MF results (dashed line). (A) The 
quantities X = {m)/nm, C = {ms)/{m){s) and D — ds x dm 
are plotted as a function of /i = /im = /^s on a logarithmic 
scale, for 7 = 1 (circles), 7 = 5 (squares), and 7 = 10 (dia- 
monds). We keep = = 1 with = ks — k. (B) The 
quantities X (left) and C (right) are plotted as a function of 
fim on a logarithmic scale, for = 4 (top), em ~ 1 (middle) 
and em ~ 1/4 (bottom). We keep /^s = 2, 7 = 1 and es = 1. 



To compare our results in the non-symmetric case, we 
consider variations in fi^, keeping fis = 2 and 7=1 
fixed. The set of parameters is once again computed 
numerically, solving 8 coupled equations generated from 
equation ([8|). The ratio is kept equal to unity while 
Em = 4, 1 and 1/4. As shown in Fig. 3B, the ansatz 
predictions are, once again, in excellent agreement with 
simulation data. 

In conclusion, we have presented a variational ap- 
proach for analyzing a coupled degradation mechanism 
of sRNA-based regulation. The latter method generates 
a set of algebraic equations that can be solved numeri- 
cally. At the simplest level, the approach reduced to the 
MF approximation which is shown to be inaccurate for 
low abundances of the interacting components. The ap- 
proach proposed allows for systematic improvements over 
MF and, at the next level, gives excellent agreement with 
simulation data for the mean and variance of steady-state 
mRNA/sRNA distributions. The results derived will aid 
approaches for inference of model parameters from exper- 



imental measurements of mean and variance. More gen- 
erally, the proposed approach can be extended to treat 
other biological networks with nonlinear interactions for 
which analytical solutions of the corresponding stochat- 
sic models are intractable. In such cases, the proposed 
procedure of constructing the variational ansatz (i.e. su- 
perposition of MF probability distributions) can lead to 
accurate estimates of the mean and variance for quanti- 
ties of interest. It is hoped that future work coupling such 
approaches with experiments will lead to quantitative un- 
derstanding of gene expression in complex networks. 
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